Guide for setting up AWS
This guide will walk you through setting up your AWS account to work with DataStori. 🚀
DataStori runs data pipelines in your environment using AWS Fargate and securely integrates with your account using a cross-account IAM role.
Prerequisites​
Before you begin the setup, please have the following information and resources ready in your AWS account.
📋 Resource Checklist​
- Networking
- VPC ID: The ID of the Virtual Private Cloud for running pipelines.
- Subnet IDs: A list of subnet IDs where the pipelines will run.
- Security Group IDs: A list of security group IDs to apply to the pipeline containers.
- Services
- ECS Cluster: Create a new AWS Fargate ECS cluster and note its ARN.
- S3 Bucket: The name of the S3 bucket where pipeline data will be stored.
- S3 Bucket Region: The AWS region where your S3 bucket is located (e.g.,
us-east-1). - RDBMS (Optional): Connection details for any relational database you plan to use.
IAM Configuration Steps​
You'll need to create two IAM roles and two IAM policies to grant DataStori the necessary permissions.
Step 1: Create a log group IAM policy.​
This policy allows Fargate container to create and write logs.
- Navigate to IAM -> Policies and click Create policy.
- Switch to the JSON tab and paste the following code.
- Name the policy "log_group_creation_policy"
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"logs:CreateLogGroup",
"logs:CreateLogStream",
"logs:PutLogEvents",
"logs:DescribeLogStreams"
],
"Resource": "*"
}
]
}
- Click save/create and exit
Step 2: Create the ECS Task Execution Role​
This role allows the Fargate container to pull images and write logs.
- Navigate to IAM -> Roles and click Create role.
- For the trusted entity, select AWS service, and for the use case, choose Elastic Container Service.
- Select the Elastic Container Service Task use case and click Next.
- On the permissions page, the
AmazonECSTaskExecutionRolePolicywill be attached by default. Also attach the AmazonS3FullAccess policy and the "log_group_creation_policy" created in Step #1 and Click Next. - Name the role
datastori-ecs-task-execution-roleand click Create role. - Once created, find the role and copy its ARN. You will need this for the next step.

Step 3: Create the DataStori Management Policy​
This policy defines the specific actions DataStori is allowed to perform, like starting and stopping pipeline tasks.
- Navigate to IAM -> Policies and click Create policy.
- Switch to the JSON tab and paste the following code.
- Important: Replace
<YOUR_AWS_ACCOUNT_ID>with your actual 12-digit AWS Account ID.
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "RunAndInspectTasks",
"Effect": "Allow",
"Action": [
"ecs:RunTask",
"ecs:StopTask",
"ecs:DescribeTasks",
"ecs:ListTasks",
"ecs:DescribeTaskDefinition"
],
"Resource": "*"
},
{
"Sid": "RegisterAndCleanupTaskDefinitions",
"Effect": "Allow",
"Action": [
"ecs:RegisterTaskDefinition",
"ecs:DeregisterTaskDefinition"
],
"Resource": "*"
},
{
"Sid": "PassSingleRoleToECSTasks",
"Effect": "Allow",
"Action": "iam:PassRole",
"Resource": "arn:aws:iam::<YOUR_AWS_ACCOUNT_ID>:role/datastori-ecs-task-execution-role",
"Condition": {
"StringEquals": {
"iam:PassedToService": "ecs-tasks.amazonaws.com"
}
}
}
]
}
- Click Next, give the policy the name
DataStori-ECSTaskManage-Policy, and click Create policy.
Step 4: Create the Cross-Account Role for DataStori​
This final role trusts DataStori's AWS account and uses the policy you just created to grant permissions.
- Navigate to IAM -> Roles and click Create role.
- For the trusted entity type, select AWS account and choose Another AWS account.
- Enter the Account ID for DataStori. (Please ask DataStori customer support for this ID).
- Click Next.
- On the permissions page, search for and select the
DataStori-ECSTaskManage-Policyyou created in Step 2. - Click Next.
- Name the role
datastori-roleand click Create role. - Once created, find the role and copy its ARN.


Logging (Optional)​
By default, DataStori will write the pipeline logs to CloudWatch. If you want to customize the logging destination, please share the ARN of the CloudWatch log group.
Final Summary​
Please provide the following information to the DataStori team to complete the setup.
- Your AWS Account ID:
123456789012 - VPC ID:
vpc-0123abcd - Subnet IDs:
subnet-abcde123, subnet-fghij456 - Security Group IDs:
sg-5678efgh - S3 Bucket Name:
your-datastori-bucket - S3 Bucket Region:
us-east-1 - ECS Cluster ARN:
arn:aws:ecs:region:account-id:cluster/YourClusterName - Task Execution Role ARN:
arn:aws:iam::account-id:role/datastori-ecs-task-execution-role - DataStori Cross-Account Role ARN:
arn:aws:iam::account-id:role/datastori-role - CloudWatch Log Group ARN (Optional): If you have a preferred logging destination.